Wireless communication terminals

ABSTRACT

The audio input system (6) of a terminal, such as a mobile telephone, is powered up during the standby mode, either with the paging channel or any other short duration channel such as a monitoring channel, and is then used to recognise narrow bandwidth sounds, such as a whistle, to activate the telephone. Once activated, the telephone may then be responsive to voice commands and may then support a speaker phone mode of application.

This invention relates to wireless communication terminals, especially mobile telephones, and the hands-free activation of such terminals.

It is known to incorporate voice recognition software in mobile telephones to allow users to dial a caller by name. However, in order to make use of this facility, the telephone has to be operated manually because, even when in the standby mode, the audio system is not normally turned on. Instead, only the receiver is powered up to receive the paging channel to check for incoming call requests, and for reasons of power saving, the audio system remains turned off.

According to the invention, a wireless communication terminal is adapted so that it is capable of recognising a predetermined sound in the vicinity of the terminal and its audio input system is powered on periodically when the terminal is in the standby mode and serves to activate the terminal if said predetermined sound is recognised.

Preferably, the audio input system is powered up with the paging channel, and preferably only operates during the paging channel for reasons of power saving, and then processes the received audio signal to recognise said predetermined sound if it is present. In a DSP based GSM terminal, the same DSP processor is used for the radio modem and audio processing, and therefore powering up the processor for paging will automatically make the audio processing function available and produce said audio signal if the audio input system is also powered up.

The paging channel in a mobile telephone consists of a number of paging blocks of short duration separated by an interval of 0.5 to 2.5 seconds. For example, a GSM terminal has a paging channel of four data blocks or bursts, each 4.615 ms long. Each burst has a portion allocated to radio modem processing and the remainder allocated to audio processing, which over four bursts might total 16 ms. Thus, the audio input system of a GSM terminal according to the invention has to recognise said predetermined sound over a short interval of about 16 ms, which would be difficult for a speech pattern. Preferably, therefore, the sound selected is a whistle, which has a narrow bandwidth characteristic and changes only slowly with time so that it can be more easily recognised from a short sample. Also, a whistle can be more easily distinguished from other sounds and will therefore avoid false responses.

The invention is therefore based on the fact that sound recognition is a useful function that can be switched on periodically in a mobile telephone during the standby mode, either with the paging channel or any other short duration channel such as a monitoring channel, and can then be used to recognise narrow bandwidth sounds, such as a whistle, to activate the telephone. Once activated, the telephone may then be responsive to voice commands and may then support a speaker phone mode of application.

The invention will now be described by way of example with reference to the accompanying drawings in which:

FIG. 1 is a schematic diagram of the major functions of a GSM mobile telephone terminal;

FIG. 2 is a schematic diagram of successive data frames or bursts in a GSM mobile telephone system; and

FIG. 3 is a graph showing the power spectrum of normal speech and a whistle.

A typical GSM mobile terminal, as illustrated in FIG. 1, comprises a radio module 1 for receiving and transmitting radio signals in respective receiving and transmitting paths RX, TX, a modem 2 to process the signals in the receive and transmit paths, a channel coder 3 to process signals in transmit and receive channels and a speech coder 4 to process speech signals which are either output to a speaker module 5 or received from a microphone module 6. It will be appreciated that the modem 2, channel coder 3 and speech coder 4 are normally incorporated in one digital signal processor DSP, and a rechargeable battery power unit 7 supplies power to all of the above components.

When such a GSM mobile terminal is in the standby mode, the power unit 7 only powers up the radio module 1 and DSP on a low duty cycle to receive a paging radio channel to check whether an incoming call is being requested. The speaker module 5 and microphone module 6 are not powered up in the standby mode in order to save power until such time as they may be required.

The paging channel in GSM consists of four data frames or bursts, each 4.615 ms long, as shown in FIG. 2. The DSP is therefore powered up for about 18.5 ms, and this is repeated at an interval of 2.1 seconds. During each burst, the DSP is only processing data relating to the radio modem function, and this only occupies a minor part of the burst, the remaining major part of the burst being reserved for audio processing when the terminal is in call. The reserved time for audio processing across the four bursts totals about 16 ms, and it is a feature of the invention that this reserved audio processing time is used by powering up the microphone module 6 during this time so that the audio input it generates is processed and compared with a predetermined audio input which is indicative of a “wake up” command from the user.
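By way of illustration only, the timing figures quoted above can be checked with a short calculation. The following Python sketch uses only the values stated in the text (four bursts of 4.615 ms, a 2.1 s repeat interval and about 16 ms of reserved audio time); the function and constant names are hypothetical and do not form part of the described terminal.

```python
# Illustrative check of the standby timing figures quoted in the description.
BURST_MS = 4.615             # duration of one GSM burst (from the text)
BURSTS_PER_PAGING_BLOCK = 4  # bursts in one paging block (from the text)
PAGING_INTERVAL_S = 2.1      # repeat interval quoted in the text
AUDIO_WINDOW_MS = 16.0       # reserved audio processing time (from the text)


def dsp_on_time_ms(burst_ms=BURST_MS, bursts=BURSTS_PER_PAGING_BLOCK):
    """Time the DSP is powered for one paging block, in milliseconds."""
    return burst_ms * bursts


def duty_cycle(on_time_ms, interval_s):
    """Fraction of the time a unit is powered, given on-time and repeat interval."""
    return on_time_ms / (interval_s * 1000.0)


on_ms = dsp_on_time_ms()
print(f"DSP on-time per paging block: {on_ms:.2f} ms")
print(f"DSP duty cycle: {duty_cycle(on_ms, PAGING_INTERVAL_S):.2%}")
print(f"Microphone duty cycle: {duty_cycle(AUDIO_WINDOW_MS, PAGING_INTERVAL_S):.2%}")
```

On these figures both the DSP and the microphone module are powered for less than 1% of the standby time.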

Said predetermined audio input is preferably a whistle, this having a narrow bandwidth characteristic which makes it more easily recognisable from a short sample, as illustrated in FIG. 3. The graph shows typical power spectra for both a whistle and normal speech, and illustrates the fact that a whistle is essentially a fairly pure single audio tone, whereas speech contains significant power in more bands across the range. Thus, whistles can be detected from only a short time period because they are easily distinguished from other sounds such as background acoustic noise, which has no sharp peaks, speech, which has multiple “formant” frequencies, and music, which like speech has multiple frequencies present.

It is not necessary that the whistle is of a particular pitch or even that the pitch is held constant with time. The recognition algorithm would merely take a snapshot of the signal and look for a single narrow-band peak much higher than the surrounding signal at other frequencies.

The key feature of the whistle is that it is narrow-band at all times; it is therefore not necessary to scan for it continuously in order to detect it. The GSM paging cycle allowing 16 ms samples of speech at a maximum of 2.1 s intervals is therefore sufficient for whistle recognition.

In a simple implementation, it would be necessary for the user to keep whistling for this maximum interval of 2.1 s to ensure that at least one block of audio samples is captured. However, if this proves too long to maintain a whistle, the required whistle length could be reduced by sampling the audio more frequently, at the cost of increased power consumption.
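The trade-off between whistle length and power consumption can be illustrated numerically. The Python sketch below rests on two assumptions not stated in the text: that a whistle must last for one sampling interval plus one 16 ms capture window to guarantee that a whole window falls inside it, and that audio power consumption scales with the number of capture windows per second. The function names are hypothetical.

```python
# Illustrative trade-off between required whistle length and audio power.
CAPTURE_WINDOW_MS = 16.0    # audio capture window (from the text)
BASELINE_INTERVAL_S = 2.1   # GSM paging interval quoted in the text


def min_whistle_s(interval_s, window_ms=CAPTURE_WINDOW_MS):
    """Shortest whistle guaranteed to span one complete capture window.

    Assumption: the whistle may start at any instant relative to the
    sampling schedule, so it must cover a full interval plus one window.
    """
    return interval_s + window_ms / 1000.0


def relative_audio_power(interval_s, baseline_s=BASELINE_INTERVAL_S):
    """Audio on-time relative to the 2.1 s baseline.

    Assumption: power is proportional to capture windows per second.
    """
    return baseline_s / interval_s


for interval in (2.1, 1.0, 0.5):
    print(f"interval {interval:.1f} s: whistle >= {min_whistle_s(interval):.3f} s, "
          f"relative audio power x{relative_audio_power(interval):.1f}")
```

Under these assumptions, halving the sampling interval roughly halves the whistle the user must sustain and doubles the audio on-time.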

A suitable whistle recognition algorithm needs to detect a narrow-band signal of unknown frequency in the presence of speech with low false alarm probability. A pre-shaping filter would be provided to remove low frequency components from the signal which would otherwise affect the recognition process.
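The text does not specify the form of the pre-shaping filter. As a minimal sketch, a first-order high-pass filter is one possible choice; the 300 Hz cutoff and the function name below are assumptions made purely for illustration, while the 8 kHz sampling rate is taken from the algorithm described in this specification.

```python
import math


def high_pass(samples, cutoff_hz=300.0, sample_rate_hz=8000.0):
    """First-order high-pass filter (one possible pre-shaping filter).

    The cutoff frequency is a hypothetical value; the description only
    says that low frequency components are removed.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    filtered = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # y[n] = alpha * (y[n-1] + x[n] - x[n-1])
        y = alpha * (prev_out + x - prev_in)
        filtered.append(y)
        prev_in, prev_out = x, y
    return filtered


# A constant (DC) offset decays towards zero after filtering.
print(abs(high_pass([1.0] * 200)[-1]) < 1e-6)
```

A filter of this kind strongly attenuates mains hum and DC offset while leaving a whistle in the kilohertz range largely intact.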

Reasonable recognition/false alarm results have been obtained using the following algorithm:

-   (i) if the energy of the block of audio samples is above a threshold, then take the FFT for 128 samples sampled at 8 kHz;
-   (ii) find the largest energy bin and find the width of the peak to half the peak power;
-   (iii) find the next largest peak excluding the interval found in (ii);
-   (iv) if the ratio of the energy in the first peak to that in the second peak is >10 dB, then declare that the whistle has been recognised.
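Steps (i) to (iv) can be sketched in Python as follows. This is an illustrative sketch only, not the implementation used to obtain the results mentioned above. The block size, sampling rate and 10 dB ratio are taken from the text; the energy threshold value, the Hann window, the use of a direct DFT in place of an FFT, and the interpretation of "next largest peak" as the largest local maximum outside the half-power interval are all assumptions.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000    # from the text
BLOCK_SIZE = 128         # from the text (128 samples = 16 ms at 8 kHz)
PEAK_RATIO_DB = 10.0     # from step (iv)
ENERGY_THRESHOLD = 1.0   # hypothetical value, arbitrary units


def power_spectrum(block):
    """Hann-windowed power spectrum (bins 0..N/2) using a direct DFT."""
    n = len(block)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(block)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def is_whistle(block):
    # (i) energy gate before any spectral analysis
    if sum(x * x for x in block) <= ENERGY_THRESHOLD:
        return False
    power = power_spectrum(block)
    # (ii) largest bin and the extent of its peak down to half power
    peak = max(range(len(power)), key=power.__getitem__)
    half = power[peak] / 2.0
    lo = hi = peak
    while lo > 0 and power[lo - 1] >= half:
        lo -= 1
    while hi < len(power) - 1 and power[hi + 1] >= half:
        hi += 1
    # (iii) largest local maximum outside the interval found in (ii)
    second = 0.0
    for k in range(1, len(power) - 1):
        if lo <= k <= hi:
            continue
        if power[k] > power[k - 1] and power[k] >= power[k + 1]:
            second = max(second, power[k])
    # (iv) compare the two peaks
    if second == 0.0:
        return True
    return 10.0 * math.log10(power[peak] / second) > PEAK_RATIO_DB


def tone(freq_hz, amplitude=1.0):
    """Synthetic test tone of one block length (illustration only)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
            for i in range(BLOCK_SIZE)]


whistle = tone(1500)
speech_like = [a + b + c for a, b, c in
               zip(tone(500), tone(1500, 0.8), tone(2500, 0.6))]
print(is_whistle(whistle), is_whistle(speech_like))
```

The synthetic three-tone signal stands in for speech with several formants; real speech and background noise would need the threshold values to be tuned experimentally.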

An alternative non-linear approach is based on the low variance of the phase increment per sample in the audio block for a whistle compared with speech.
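The text gives no further detail of this alternative. One way it might be realised, offered here as a sketch under stated assumptions, is to form the analytic signal of the audio block, measure the phase step between successive samples, and compare the variance of those steps with a threshold. The use of a DFT-based analytic signal, the 0.05 rad² threshold and all function names are assumptions made for illustration.

```python
import cmath
import math


def analytic_signal(block):
    """Analytic signal via a direct DFT: negative frequencies are zeroed."""
    n = len(block)
    spectrum = [sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(block)) for k in range(n)]
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2      # double the positive frequencies
        elif k > n / 2:
            spectrum[k] = 0       # discard the negative frequencies
    return [sum(c * cmath.exp(2j * math.pi * k * i / n)
                for k, c in enumerate(spectrum)) / n for i in range(n)]


def phase_increment_variance(block):
    """Variance (rad^2) of the per-sample phase increment about its mean."""
    z = analytic_signal(block)
    steps = [z[i + 1] * z[i].conjugate() for i in range(len(z) - 1)]
    mean_angle = cmath.phase(sum(steps))
    # Rotate by the mean angle first so that deviations are not wrapped.
    deviations = [cmath.phase(s * cmath.exp(-1j * mean_angle)) for s in steps]
    return sum(d * d for d in deviations) / len(deviations)


def is_whistle_by_phase(block, threshold=0.05):
    """Threshold is a hypothetical value chosen for this illustration."""
    return phase_increment_variance(block) < threshold


def tone(freq_hz, amplitude=1.0, n=128, sample_rate_hz=8000):
    """Synthetic test tone (illustration only)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n)]


whistle = tone(1500)
speech_like = [a + b + c for a, b, c in
               zip(tone(500), tone(1500, 0.8), tone(2500, 0.6))]
print(is_whistle_by_phase(whistle), is_whistle_by_phase(speech_like))
```

For a single tone the phase advances by the same amount at every sample, so the variance is close to zero, whereas a signal with several frequency components produces an irregular phase advance.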

Although the algorithm has been discussed in terms of GSM, it will be appreciated that it can be generalised for any wireless communications system. The only requirement is the capability to periodically switch on the audio hardware to sample 16 ms of audio data. All mobile phone systems should fulfil this requirement since the mobile will need to switch itself on periodically either to listen for paging signals (or their equivalent) or for network measurements, and being a phone it should have the appropriate audio capabilities. As long as this duty cycle is sufficient, the algorithm need not be modified.

In one embodiment of the invention, a mobile terminal is further adapted to include voice dialling and speaker phone operation. The user is then able to use the terminal in hands-free mode as follows:

-   (i) user whistles;
-   (ii) terminal responds with an acknowledgement, probably audible, e.g. a beep or some pre-recorded message or tune;
-   (iii) user says the voice command, e.g. a name to be dialled;
-   (iv) user engages in the call (using speaker phone operation), or executes whatever other command has been pre-programmed.

Speaker phone operation with a mobile terminal requires a loud audio output and some form of echo control.

1. A radio communication terminal comprising a radio module for processing radio signals, a processor for processing digital signals associated with the radio signals, an audio generator adapted to generate an audio input signal to the processor in response to sound in the vicinity of the terminal, a power supply, and a power controller to control connection of the power supply to the radio module and having a standby mode in which the radio module and processor are energised periodically to detect a radio channel, characterised in that the power controller energises the audio generator to generate an audio input to the processor only during the radio channel, and the processor is adapted to respond to a predetermined sound by activating said terminal for communication.
2. A terminal as claimed in claim 1 in which the processor processes digital signals from the radio module during one or more successive data bursts of the radio channel.
3. A terminal as claimed in claim 2 in which the radio channel is a paging channel.
4. A terminal as claimed in claim 1 in which said predetermined sound comprises a narrow-band sound.
5. A terminal as claimed in claim 4 in which said predetermined sound comprises a whistle.
6. A terminal as claimed in claim 1 in which the processor incorporates a sound recognition algorithm which distinguishes said predetermined sound from speech in the audio input signal.
7. A terminal as claimed in claim 6 in which the recognition algorithm is adapted to detect total energy in the audio input signal above a predetermined threshold.
8. A terminal as claimed in claim 7 in which the recognition algorithm is adapted to detect multiple energy peaks at different frequencies in the audio input signal, and to compare the energy in these peaks.
9. A terminal as claimed in any one of the preceding claims which includes a pre-shaping filter to filter out low frequency components from the audio input signal before it is processed by the processor.
10. A terminal as claimed in any one of claims 1 to 6 in which the recognition algorithm is adapted to detect low variance of the phase increment per sample in an audio block for said predetermined sound compared with speech.
11. A terminal as claimed in any one of the preceding claims in which the terminal responds to said predetermined sound by generating an audible response.
12. A terminal as claimed in any one of the preceding claims which is adapted to recognise speech commands for setting up calls from the terminal.
13. A terminal as claimed in any one of the preceding claims which is adapted for speaker phone operation.